The Remedian: A Robust Averaging Method for Large Data Sets

Authors

  • PETER J. ROUSSEEUW
  • GILBERT W. BASSETT
Abstract

It is often assumed that to compute a robust estimator on n data values one needs at least n storage elements (unlike the sample average, which may be calculated with an updating mechanism). This is one of the main reasons why robust estimators are seldom used for large data sets and why they are not included in most statistical packages. We introduce a new estimator that takes up little storage space, investigate its statistical properties, and provide an example on real-time curve "averaging" in a medical context. The remedian with base b proceeds by computing medians of groups of b observations, and then medians of these medians, until only a single estimate remains. This method merely needs k arrays of size b (where n = b^k), so the total storage is O(log n) for fixed b or, alternatively, O(n^{1/k}) for fixed k. Its storage economy makes it useful for robust estimation in large databases, for real-time engineering applications in which the data themselves are not stored, and for resistant "averaging" of curves or images. The method is equivariant for monotone transformations. Optimal choices of b with respect to storage and finite-sample breakdown are derived. The remedian is shown to be a consistent estimator of the population median, and it converges at a nonstandard rate to a median-stable distribution.
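The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the class name `Remedian`, its methods, and the helper `remedian` are hypothetical names chosen for illustration, and the sketch assumes an odd base b and exactly n = b^k observations. It keeps k buffers of size b, so storage is b·k values regardless of n.

```python
from statistics import median


class Remedian:
    """Streaming remedian with base b and k buffers (illustrative sketch).

    Assumes exactly b**k observations are supplied and that b is odd,
    so every group median is one of the observed values.
    """

    def __init__(self, b, k):
        if b < 1 or k < 1:
            raise ValueError("b and k must be positive integers")
        self.b = b
        self.k = k
        # k buffers, each holding at most b values: total storage is b * k.
        self.buffers = [[] for _ in range(k)]
        self.result = None

    def update(self, x):
        """Feed one observation; full buffers are collapsed to their median."""
        if self.result is not None:
            raise ValueError("already received b**k observations")
        level, value = 0, x
        while True:
            buf = self.buffers[level]
            buf.append(value)
            if len(buf) < self.b:
                return
            # Buffer is full: replace its b values by their median
            # and pass that median up to the next level.
            value = median(buf)
            buf.clear()
            level += 1
            if level == self.k:
                # The top-level buffer has been collapsed: single estimate.
                self.result = value
                return


def remedian(data, b, k):
    """Compute the remedian of exactly b**k values using k buffers of size b."""
    est = Remedian(b, k)
    count = 0
    for x in data:
        est.update(x)
        count += 1
    if count != b ** k:
        raise ValueError("this sketch requires exactly b**k observations")
    return est.result
```

For instance, with b = 3 and k = 2, the nine values 1, 2, 1000, 4, 5, 6, 7, 8, 9 are reduced to the group medians 2, 5, 8, whose median is 5; the outlier 1000 does not affect the estimate.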


Similar resources

A robust aggregation operator for multi-criteria decision-making method with bipolar fuzzy soft environment

Molodtsov initiated soft set theory, which provides a general mathematical framework for handling uncertainties in which data are encountered with an attached parameterized factor during information analysis, in contrast to fuzzy and bipolar fuzzy set theory. The main object of this paper is to lay a foundation for providing a new application of the bipolar fuzzy soft tool in ...


Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data

Background and purpose: As science, knowledge, and technology evolve, we deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and interpretation. For high-dimension problems, classical methods are not reliable because of a large number of predictor variable...


Fuzzy Robust Regression Analysis with Fuzzy Response Variable and Fuzzy Parameters Based on the Ranking of Fuzzy Sets

Robust regression is an appropriate alternative for ordinal regression when outliers exist in a given data set. If we have fuzzy observations, ordinal regression methods cannot model them; in this case, fuzzy regression is a good method. When observations are fuzzy and there are outliers in the data set, robust fuzzy regression methods are appropriate alternatives....


A robust least squares fuzzy regression model based on kernel function

In this paper, a new approach is presented to fit a robust fuzzy regression model based on some fuzzy quantities. In this approach, we first introduce a new distance between two fuzzy numbers using the kernel function, and then, based on the least squares method, the parameters of the fuzzy regression model are estimated. The proposed approach has a suitable performance to...


Hesitant q-rung orthopair fuzzy aggregation operators with their applications in multi-criteria decision making

The aim of this manuscript is to present a new concept of hesitant q-rung orthopair fuzzy sets (Hq-ROFSs) by combining the concepts of q-ROFSs and hesitant fuzzy sets. The proposed concept is a generalization of fuzzy sets, intuitionistic fuzzy sets, hesitant fuzzy sets, and Pythagorean fuzzy sets, as well as intuitionistic hesitant fuzzy sets (IHFSs) and hesitant Pythagorean fuz...




Publication date: 2008